In this paper we provide a rigorous proof that feedback cannot increase 
the capacity of the channel with additive colored gaussian noise by more 
than a factor of two. We also give a tighter bound showing that any increase 
in capacity is less than the normalized correlation between the signal and 
noise. It is further shown that gaussian signals and linear feedback 
processing will achieve capacity. 

The practical implications are that (i) feedback should be used to simplify 
encoding and decoding, since there is little to be gained in the way of 
increased capacity, and (ii) the various proposed schemes which use linear 
feedback are doing the correct thing. 

I. INTRODUCTION 

When Shannon first showed that feedback could not increase the 
capacity of a memoryless channel, he mentioned that the capacity 
could be increased when the channel had memory [1]. One example of 
such a channel is the additive colored gaussian noise channel with an 
average power limitation on the transmitted signal. We prove here 
that the capacity of this channel is never more than twice the capacity 
without feedback, and that as the noise becomes white the capacity 
approaches the forward capacity. The limiting case has been attributed to 
Shannon for years and has only recently been rigorously proven [2]. 

We derive an exact expression for the mutual information between 
the input and output of the channel. The application of different bounds 
to this expression produces twice the forward capacity with the weakest 
bound, or the forward capacity plus the normalized correlation of the 
signal and noise with a slightly stronger bound. It is shown that a 
gaussian signal maximizes the information, and consequently the opti- 
mum feedback technique is linear. 

Our results are based on the model shown in Fig. 1. The added noise 
spectrum is normalized to 1 at infinite frequency, is bounded, and has 
an integrable logarithm. This allows us to represent the noise as in 
Fig. 2. The noise now consists of a white component plus a filtered 
version of the white noise. The imposed restrictions are for mathematical 
purposes only and are of no practical significance. 

Fig. 1. Channel with noiseless feedback. 

Theorem 1: The mutual information between the input and output 
of a channel with additive gaussian noise with spectral density $N(\omega)$ and 
arbitrary causal feedback processing, as shown in Fig. 1, is given by: 

$$I(m; Y_T) = \tfrac{1}{2} E \int_0^T E^2[s(t) + z(t) \mid m, Y_t]\, dt - \tfrac{1}{2} E \int_0^T E^2[s(t) + z(t) \mid Y_t]\, dt \tag{1}$$

where $Y_t$ is $y(\tau)$, $0 \le \tau < t$, and the inner expectations are 
conditioned on $Y_t$, or on $Y_t$ and $m$. $z(t)$ is a linear causal functional 
of white noise with the properties that: 

$$z(t) = \int_0^t h(t - \tau)\, dw(\tau) + \int_0^\infty h(t + \tau)\, dv(\tau), \tag{2}$$

$$|1 + H(\omega)|^2 = N(\omega).$$

The two functions $w(t)$ and $v(t)$ are independent Wiener processes. The 
reason for introducing the second term is to make $n(t) = z(t) + w(t)$ a 
stationary process. 

Fig. 2. Model of nonwhite noise. 

Proof: We first observe that $w(t) + z(t)$ is equivalent to noise with 
spectral density $N(\omega)$. A causal filter, $h(\tau)$, will exist whenever $N(\omega)$ 
represents the square magnitude of a causal filter 

$$|G(\omega)|^2 = N(\omega),$$

$$H(\omega) = G(\omega) - 1.$$



The logarithm of $G(\omega)$ is 

$$\tfrac{1}{2} \ln N(\omega) + iB(\omega)$$

where $B(\omega)$ is the phase characteristic of $G(\omega)$. The conditions of 
causality, no lower half plane poles, will be met when $B(\omega)$ is one half the 
Hilbert transform of $\ln N(\omega)$. The conditions on $N(\omega)$ ensure that 
$\ln N(\omega)$ has a Hilbert transform. 

Now to prove formula (1) we use a theorem due to Kadota, Zakai, 
and Ziv [2], which we state without proof: 

Theorem A: The mutual information between the input parameter 
$m$ and the output process $Y_T$ of a finite power system disturbed by 
additive white gaussian noise is 

$$I(m; Y_T) = \tfrac{1}{2} E \int_0^T \phi^2(t, m, Y_t)\, dt - \tfrac{1}{2} E \int_0^T E^2[\phi(t, m, Y_t) \mid Y_t]\, dt,$$

where $\phi(t, m, Y_t)$ is the causal modulating function. 

This result is applied to the non-white noise problem by considering 
$z(t)$ to be part of the signal. The inclusion is only useful when one is 
calculating the mutual information; it is not to be included in the 
calculation of transmitter power. Theorem A cannot be applied directly 
since the signal, $\phi$, which is taken as $s(t) + z(t)$, is not completely 
determined by $m$ and $Y_t$, but is also a function of the process $v(t)$. To 
find $I(m; Y_T)$ we use the decomposition, 

$$I(m, V; Y_T) = I(m; Y_T) + I(V; Y_T \mid m), \tag{3}$$

where $V$ is the process $v(t)$. 
From Theorem A we have, 

$$I(m, V; Y_T) = \tfrac{1}{2} E \int_0^T [s(t) + z(t)]^2\, dt - \tfrac{1}{2} E \int_0^T E^2[s(t) + z(t) \mid Y_t]\, dt \tag{4}$$

and 

$$I(V; Y_T \mid m) = \tfrac{1}{2} E \int_0^T [s(t) + z(t)]^2\, dt - \tfrac{1}{2} E \int_0^T E^2[s(t) + z(t) \mid Y_t, m]\, dt,$$

which together with equation (3) proves Theorem 1. $s(t) + z(t)$ has 
finite energy because $s(t)$ must have finite energy and $z(t)$ will have 
finite energy whenever the channel has finite capacity without 
feedback, as we shall see when we evaluate $E[z^2(t)]$. With this basic result 
we can derive several interesting corollaries concerning the information. 
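
The subtraction behind Theorem 1 can be spelled out as a short worked step. The sketch below only rearranges equation (3) and substitutes equation (4) together with the expression for $I(V; Y_T \mid m)$. The shorthand $\phi = s(t) + z(t)$ is an assumption made here for brevity.

```latex
\begin{align*}
I(m; Y_T) &= I(m, V; Y_T) - I(V; Y_T \mid m) \\
  &= \Big[ \tfrac12 E\!\int_0^T \phi^2\,dt - \tfrac12 E\!\int_0^T E^2[\phi \mid Y_t]\,dt \Big]
   - \Big[ \tfrac12 E\!\int_0^T \phi^2\,dt - \tfrac12 E\!\int_0^T E^2[\phi \mid Y_t, m]\,dt \Big] \\
  &= \tfrac12 E\!\int_0^T E^2[\phi \mid m, Y_t]\,dt - \tfrac12 E\!\int_0^T E^2[\phi \mid Y_t]\,dt .
\end{align*}
```

The energy terms $\tfrac12 E\int_0^T \phi^2\,dt$ cancel, which is why only the two conditional-mean terms survive in equation (1).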

Corollary 1: (Pinsker)* Under the conditions of Theorem 1, 

$$\frac{I(m; Y_T)}{T} \le 2C,$$

where $C$ is the capacity of the channel without feedback. 
First we observe by equation (3) that 

$$I(m; Y_T) \le I(m, V; Y_T),$$

which is given by equation (4). Furthermore the second term in 
equation (4) is nonpositive and can be ignored, thus 

$$I(m; Y_T) \le \tfrac{1}{2} E \int_0^T (s + z)^2\, dt. \tag{5}$$

$I(m; Y_T)$ can be further bounded by 

$$I(m; Y_T) \le E \int_0^T s^2\, dt + E \int_0^T z^2\, dt \tag{6}$$

since $(s + z)^2 \le 2s^2 + 2z^2$. 
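
The elementary inequality used to pass from equation (5) to equation (6) follows from completing the square. As a one-line check:

```latex
\[
2s^2 + 2z^2 - (s+z)^2 = s^2 - 2sz + z^2 = (s-z)^2 \ge 0
\quad\Longrightarrow\quad
\tfrac12 (s+z)^2 \le s^2 + z^2 .
\]
```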

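The next step is to calculate the variance of $z$, since this enters 
directly into $I(m; Y_T)$. 

$$E \int_0^T z^2(t)\, dt = T\,E(z^2),$$

$$\begin{aligned}
E(z^2) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} |H(\omega)|^2\, d\omega = \frac{1}{2\pi} \int_{-\infty}^{\infty} |G(\omega) - 1|^2\, d\omega \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \exp\!\left[ \tfrac{1}{2} \ln N(\omega) + \tfrac{i}{2} [\ln N(\omega)]^{\dagger} \right] - 1 \right|^2 d\omega \\
&= -\frac{1}{2\pi} \int_{-\infty}^{\infty} [1 - N(\omega)]\, d\omega - \frac{1}{\pi} \operatorname{Re} \int_{-\infty}^{\infty} \left\{ \exp\!\left[ \tfrac{1}{2} \ln N(\omega) + \tfrac{i}{2} [\ln N(\omega)]^{\dagger} \right] - 1 \right\} d\omega.
\end{aligned}$$

* The factor of 2 has been mentioned earlier by Pinsker but no proof has yet 
been published. 

† Indicates the Hilbert transform. 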

This latter integral, as chance would have it, is almost identical in 
structure to an integral which arises in evaluating the spectral density 
of a single sideband FM wave (at the carrier frequency) which is 
modulated by a gaussian signal. The quantity $\tfrac{1}{2} \ln N(\omega)$ here plays 
the role of the autocorrelation function of the gaussian signal, and 
although for our problem $\tfrac{1}{2} \ln N(\omega)$ is not in general an 
autocorrelation function, the integral may be discussed via the technique used 
in the FM problem (see Mazo and Salz [3]). 
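Define: 

$$\frac{1}{2\pi} \int_{-\infty}^{\infty} \left[ \tfrac{1}{2} \ln N(\omega) + \tfrac{i}{2} [\ln N(\omega)]^{\dagger} \right] e^{i\omega\tau}\, d\omega = f(\tau);$$

then, with $F(\omega)$ the Fourier transform of $f(\tau)$, that is, the logarithm of $G(\omega)$, 

$$\frac{d}{d\omega} [G(\omega) - 1] = G(\omega) \frac{d}{d\omega} F(\omega) = [G(\omega) - 1] \frac{d}{d\omega} F(\omega) + \frac{d}{d\omega} F(\omega).$$

In the time domain this becomes 

$$-it\,h(t) = -it\,f(t) - i \int_0^t \tau f(\tau)\, h(t - \tau)\, d\tau$$

because both $h(\tau)$ and $f(\tau)$ are zero for negative $\tau$. Both $f(\tau)$ and $h(\tau)$ 
are finite for small $\tau$ and thus 

$$h(\tau = 0) = f(\tau = 0).$$

The integral we are interested in is $2 \operatorname{Re} h(\tau = 0)$, which is equal to 

$$2 \operatorname{Re} f(\tau = 0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \ln N(\omega)\, d\omega.$$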

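Thus far we have shown that 

$$E \int_0^T s^2\, dt + E \int_0^T z^2\, dt = E \int_0^T s^2\, dt - \frac{T}{2\pi} \int_{-\infty}^{\infty} [1 - N(\omega)]\, d\omega - \frac{T}{2\pi} \int_{-\infty}^{\infty} \ln N(\omega)\, d\omega. \tag{7}$$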
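
The variance formula behind equation (7) can be sanity-checked numerically. The sketch below is an illustration only: it assumes a hypothetical one-pole filter $h(t) = a e^{-bt}$ (so $H(\omega) = a/(b + i\omega)$ and $N(\omega) = |1 + H(\omega)|^2$, which tends to 1 at infinite frequency), and the names `integrate_real_line`, `h_squared`, and `n_minus_one` are invented for this example. It compares $\frac{1}{2\pi}\int |H|^2\,d\omega$ with $-\frac{1}{2\pi}\int [1 - N]\,d\omega - \frac{1}{2\pi}\int \ln N\,d\omega$.

```python
import math

def integrate_real_line(f, n=20_000):
    """Midpoint rule for the integral of f over (-inf, inf), using omega = tan(theta)."""
    step = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * step
        # d(omega) = d(theta) / cos(theta)^2 under the substitution omega = tan(theta)
        total += f(math.tan(theta)) / math.cos(theta) ** 2
    return total * step

# Hypothetical example filter (an assumption, not from the paper):
# h(t) = a * exp(-b * t) for t >= 0, so H(omega) = a / (b + i * omega).
a, b = 0.7, 1.3

def h_squared(omega):
    # |H(omega)|^2
    return a ** 2 / (b ** 2 + omega ** 2)

def n_minus_one(omega):
    # N(omega) - 1 = |1 + H(omega)|^2 - 1, written to avoid cancellation
    return ((a + b) ** 2 - b ** 2) / (b ** 2 + omega ** 2)

two_pi = 2 * math.pi
# Left side: E(z^2) computed directly from |H|^2.
variance_direct = integrate_real_line(h_squared) / two_pi
# Right side: -(1/2pi) * int [1 - N] - (1/2pi) * int ln N.
variance_formula = (
    integrate_real_line(n_minus_one)
    - integrate_real_line(lambda w: math.log1p(n_minus_one(w)))
) / two_pi

print(variance_direct, variance_formula, a ** 2 / (2 * b))
```

For this filter the closed form of $\frac{1}{2\pi}\int |H|^2\,d\omega$ is $a^2/(2b)$, so all three printed values should agree to within the integration error.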

One more trick is needed to prove the corollary. We have, up to this 
point, considered only normalized channels which had $N(\infty) = 1$. 
This is valid because normalization cannot affect the ratio of the 
capacity without feedback to that with feedback. Some channels cannot 
be normalized in this manner, i.e., $N(\infty) = \infty$ or $N(\infty) = 0$. 
The latter case has infinite capacity and thus the corollary applies. 
The former presents no problems due to the following lemma. 

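Lemma: Consider the channel without feedback. By the water pouring 
argument [4] we know that the signal energy spectrum $S(\omega)$ which achieves 
capacity obeys: 

$$S(\omega) = \begin{cases} K - N(\omega), & N(\omega) \le K, \\ 0, & \text{otherwise.} \end{cases}$$

If we define a new noise 

$$N^\circ(\omega) = \begin{cases} N(\omega), & N(\omega) \le K, \\ K, & N(\omega) > K, \end{cases}$$

this new channel has the same capacity without feedback and a capacity 
with feedback that is at least as large. 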

Proof: The expression for capacity without feedback is the same 
for $N(\omega)$ and $N^\circ(\omega)$. The capacity with feedback can only be increased 
since $N^\circ(\omega) \le N(\omega)$ for all $\omega$. For if the capacity with $N(\omega)$ were larger, 
one could add a noise with spectrum $N(\omega) - N^\circ(\omega)$ at the receiver 
and do just as well as if the noise were $N(\omega)$. 
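
The first claim of the lemma, that clipping the noise at the water level leaves the forward capacity unchanged, can be illustrated numerically. This is a minimal sketch under stated assumptions: the spectrum $N(\omega) = 1 + \omega^2$, the cutoff `W`, the grid size, the power value, and all function names are hypothetical choices made for the example. Capacity is computed in nats per second as $\frac{1}{4\pi}\int \ln(K/N)\,d\omega$ over the band where $N < K$.

```python
import math

def noise(omega):
    # Hypothetical spectrum with N(inf) = inf (cannot be normalized at infinity)
    return 1.0 + omega ** 2

# Midpoint grid on [-W, W]; W is an assumed cutoff wide enough to hold the signaling band.
W, n = 10.0, 4000
d_omega = 2 * W / n
omegas = [-W + (k + 0.5) * d_omega for k in range(n)]

def power_used(spectrum, K):
    # (1/2pi) * integral of max(K - N, 0): the power poured up to level K
    return sum(max(K - spectrum(w), 0.0) for w in omegas) * d_omega / (2 * math.pi)

def capacity(spectrum, K):
    # (1/4pi) * integral of ln(K / N) over the band where N < K
    return sum(math.log(K / spectrum(w)) for w in omegas if spectrum(w) < K) * d_omega / (4 * math.pi)

def water_level(spectrum, power, lo=0.0, hi=100.0, iters=60):
    # Bisection on K: power_used is nondecreasing in K
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if power_used(spectrum, mid) < power:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

P = 2 / (3 * math.pi)            # chosen so that the exact water level is K = 2
K = water_level(noise, P)

def truncated(omega):
    # The clipped noise of the lemma: equal to N where N <= K, and K elsewhere
    return min(noise(omega), K)

C = capacity(noise, K)
C_truncated = capacity(truncated, K)
print(K, C, C_truncated)
```

For this choice the band is $|\omega| < 1$, and the exact capacity is $(4 - \pi)/(4\pi)$ nats per second. The two computed capacities coincide because the clipped spectrum differs from the original only where no signal energy is placed.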

We now normalize the noise, $N^\circ(\omega)$, in order to apply equation (6), 
which makes $K = 1$. The capacity without feedback is: 

$$C = \frac{1}{4\pi} \int_{-\infty}^{\infty} \ln \frac{1}{N^\circ(\omega)}\, d\omega. \tag{8}$$



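With feedback, from equations (6), (7) and (8), 

$$I(m; Y_T) \le E \int_0^T s^2\, dt - TP + 2TC,$$

where $P = \frac{1}{2\pi} \int_{-\infty}^{\infty} [1 - N^\circ(\omega)]\, d\omega$ is the signal power used by water 
pouring with $K = 1$. Since the power limitation gives $E \int_0^T s^2\, dt \le TP$, 
this becomes 

$$\frac{I(m; Y_T)}{T} \le 2C.$$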

A tighter bound can be obtained by returning to equation (5) and 
writing: 

$$I(m; Y_T) \le \tfrac{1}{2} \left[ E \int_0^T s^2\, dt + E \int_0^T z^2\, dt \right] + E \int_0^T sz\, dt,$$

which by the preceding argument is no greater than 

$$TC + E \int_0^T s n^\circ\, dt.$$

The correlation $E\,sz^\circ$ is equal to $E\,sn^\circ$ because $n^\circ$ and $z^\circ$ only differ by a 
white component. Thus the capacity can be increased only by the 
correlation of the signal with the noise. The noise $n^\circ$ is not the original 
noise; however, the difference occurs only at frequencies not used for 
signaling without feedback. As $N(\omega)$ becomes white, the energy in $z^\circ$ 
decreases and consequently $E\,sz^\circ$ must go to zero. 

More insight into the problem is supplied by the following theorem. 

Theorem 2: Capacity can be attained with a gaussian signal s(t). 

Proof: First we observe that 

$$E[s(t) + z(t) \mid m, Y_t] = s(t, m, Y_t) + E[z(t) \mid W_t].$$

This is true because $s(t)$ is known given $m$ and $Y_t$, and $z(t)$ is dependent 
on $W_t$, which can be calculated given $Y_t$ and $s(t)$. $E[z(t) \mid W_t]$ is a linear 
functional of $w$ because $w$ is gaussian. 

$$E[z(t) \mid W_t] = \int_0^t K(t, \tau)\, dw(\tau).$$



The first term in equation (1) depends only on the correlation prop- 
erties of s(t, m, Y t ) and w(t) and therefore we can use a gaussian s of 
the appropriate correlation. For the second term we use the property 
that a least-squares linear estimate has no more energy than the more 
general least square estimate. 

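$$E x^2 = E \hat{x}^2 + E(x - \hat{x})^2 = E \tilde{x}^2 + E(x - \tilde{x})^2$$

where $\tilde{x}$ is the least-squares linear estimate of $x$ and $\hat{x}$ is the least-squares 
estimate. Since 

$$E(x - \hat{x})^2 \le E(x - \tilde{x})^2,$$

$$E \tilde{x}^2 \le E \hat{x}^2.$$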
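
The energy comparison between the two estimates can be checked on a small discrete example. The sketch below is illustrative only: the joint distribution `pmf` is a hypothetical choice, "linear" is taken to mean a homogeneous estimate $c\,y$ with $c = E[xy]/E[y^2]$, and exact rational arithmetic is used so the orthogonal decompositions hold without rounding.

```python
from fractions import Fraction as F

# Hypothetical joint distribution of (x, y); the probabilities sum to 1.
pmf = {(1, -1): F(1, 4), (0, 0): F(1, 4), (1, 1): F(1, 4), (3, 1): F(1, 4)}

def expect(g):
    # Expectation of g(x, y) under pmf
    return sum(p * g(x, y) for (x, y), p in pmf.items())

def cond_mean(y0):
    # Least-squares estimate of x given y = y0, i.e. the conditional mean E[x | y = y0]
    p_y = sum(p for (x, y), p in pmf.items() if y == y0)
    return sum(p * x for (x, y), p in pmf.items() if y == y0) / p_y

# Least-squares linear estimate c * y with c = E[xy] / E[y^2]
c = expect(lambda x, y: x * y) / expect(lambda x, y: y * y)

energy_x = expect(lambda x, y: x * x)
energy_hat = expect(lambda x, y: cond_mean(y) ** 2)       # energy of conditional mean
err_hat = expect(lambda x, y: (x - cond_mean(y)) ** 2)
energy_tilde = expect(lambda x, y: (c * y) ** 2)          # energy of linear estimate
err_tilde = expect(lambda x, y: (x - c * y) ** 2)

print(energy_x, energy_hat, err_hat, energy_tilde, err_tilde)
```

Both decompositions sum to the same total energy, and the linear estimate carries the smaller share, which is the property the proof uses for the second term of equation (1).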

Therefore, since $E[s(t) + z(t) \mid Y_t]$ is the least-squares estimate of 
$s(t) + z(t)$ given $Y_t$, we have 

$$I(m; Y_T) \le \tfrac{1}{2} E \int_0^T E^2[s + z \mid m, Y_t]\, dt - \tfrac{1}{2} E \int_0^T \big( \widetilde{s + z} \big)^2\, dt,$$

where $\widetilde{s + z}$ is the least-squares linear estimate of $s(t) + z(t)$ given $Y_t$; 
but for a gaussian signal this inequality is an equality. In addition the 
signal power is unchanged and the feedback processor need only be 
linear. Therefore one need consider only gaussian input and linear 
processing in calculating capacity. 

II. GENERALITY OF THE MODEL 

The restrictions on $N(\omega)$ are in fact only needed for $N^\circ(\omega)$. If a 
noise spectrum is such that the logarithmic integral of $N^\circ(\omega)$ is minus 
infinity then the capacity of the channel is infinite without feedback. 
Therefore the bound applies to any channel which has a finite capacity 
without feedback. 

The bounds are all valid for noisy feedback as well, however it is not 
clear that gaussian signals are optimum in that case. 

III. ACKNOWLEDGMENT 

The author is indebted to J. Salz and J. Mazo for helpful discussion 
and in particular the evaluation of the integral of $|H(\omega)|^2$. 

REFERENCES 

1. Shannon, C. E., "The Zero Error Capacity of a Noisy Channel," IRE Trans. 
   Inform. Theory, IT-2, No. 3 (September 1956), pp. 8-19. 

2. Kadota, T. T., Zakai, M., and Ziv, J., "On the Capacity of Continuous 
   Memoryless Channels with Feedback," to be published in the IEEE Trans. 
   Inform. Theory. 

3. Mazo, J. E. and Salz, J., "Spectral Properties of Single-Sideband Angle 
   Modulation," IEEE Trans. on Comm. Tech., Com-16, No. 1 (February 1968), 
   pp. 52-62. 

4. Shannon, C. E., "A Mathematical Theory of Communication," B.S.T.J., 27, 
   No. 4 (October 1948), pp. 623-656. 



